Orthogonal latin rectangles 

Roland Häggkvist

Matematiska institutionen, Umeå universitet,

S-901 87 Umeå, Sweden

and

Anders Johansson

N-institutionen, Högskolan i Gävle

S-801 76 Gävle, Sweden

Email: ajj@hig.se, rolandh@math.umu.se

Sep 21, 2004

Abstract

We use a greedy probabilistic method to prove that for every $\varepsilon > 0$,
every $m \times n$ Latin rectangle on $n$ symbols has an orthogonal mate,
where $m = (1 - \varepsilon)n$. That is, we show the existence of a second latin
rectangle such that no pair of the $mn$ cells receives the same pair of
symbols in the two rectangles.

1 Introduction

This paper was inspired by a problem posed by Anthony J. W. Hilton at the
thirteenth British Combinatorial Conference 1991 [6]. The problem is:

Let $R$ be an $n \times 2n$ Latin rectangle on $2n$ symbols. A partial
transversal $T$ of size $s$ of $R$ is a collection of $s$ cells, no two in the
same row or column, and no two containing the same symbol.

Is it true that $R$ can be expressed as the union of $2n$ partial
transversals of size $n$?

An equivalent formulation: Call two $n \times 2n$ Latin rectangles $R$, $S$
on the same set of symbols orthogonal if the pairs $(r_{ij}, s_{ij})$, for
$i = 1, \dots, n$ and $j = 1, \dots, 2n$, are all distinct. Does every
$n \times 2n$ Latin rectangle have an orthogonal mate?

The updated problem list from the British Combinatorial Conferences from
number twelve and upwards can be found electronically as a link on the
homepage of the British Combinatorial Conference.

While it is quite easy to see that every $n \times 4n$ Latin rectangle has an
orthogonal mate, we know of no argument that solves the problem for
$n \times 3n$ when $n = 100$, say. No doubt such an argument can be found
eventually.

In the current paper we use probabilistic methods to prove a far stronger
statement than Hilton proposed, but only valid for large $n$, namely that for
every $\varepsilon > 0$, every $(n - \varepsilon n) \times n$ Latin rectangle
has an orthogonal mate for large enough $n$.

We know of no example of an $(n - 1) \times n$ Latin rectangle without an
orthogonal mate, but would not be too surprised if such an example could be
constructed. This ties up with well-known conjectures and results concerning
the length of partial transversals in latin squares. Recall that Ryser [8],
Brualdi [4, p. 103] and Stein [11] have conjectures (in particular Stein has
much stronger conjectures, one of which was refuted by Drisko [5]) which
imply that every $(n - 1) \times n$ latin rectangle has a transversal of
length $n - 1$. In this context we also recall some standard results on the
length of partial transversals in latin squares, viz.: every $n \times n$
latin square has a partial transversal of length at least $n - \sqrt{n}$
(proved by Woolbright [12], and Brouwer, de Vries and Wieringa [3]), and of
length at least $n - 5.53(\log n)^2$, proved by Shor [10].

1.1 The result once again 

We consider $m \times n$ latin rectangles on $n$ symbols, with $n$ columns and
$m$ rows, i.e., an assignment to each cell of an $m \times n$ table of one of
$n$ symbols such that each symbol occurs exactly once in each row and at most
once in each column.

Two latin $m \times n$ rectangles, $L$ and $J$, are orthogonal if the
following holds: For any two colours $\alpha$, $\beta$ the colour classes
$L^{-1}(\alpha)$ and $J^{-1}(\beta)$ intersect in at most one element.
Equivalently, each colour class $L^{-1}(\alpha)$ is a transversal of $J$ and
vice versa.
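
To make the definition concrete, the following Python sketch checks the two
properties on small examples. The helper names `is_latin_rectangle` and
`are_orthogonal`, and the convention that the symbols are the integers
0, ..., n-1, are assumptions of this illustration and not notation from the
paper.

```python
def is_latin_rectangle(rect, n):
    """True if every row is a permutation of 0..n-1 and no column repeats a symbol."""
    if any(sorted(row) != list(range(n)) for row in rect):
        return False
    for k in range(n):
        column = [row[k] for row in rect]
        if len(set(column)) != len(column):
            return False
    return True


def are_orthogonal(L, J):
    """True if the m*n pairs (L[i][k], J[i][k]) are pairwise distinct."""
    pairs = {(a, b) for row_l, row_j in zip(L, J) for a, b in zip(row_l, row_j)}
    return len(pairs) == len(L) * len(L[0])


# A 2 x 3 example: J and L are latin rectangles on the symbols 0, 1, 2.
J = [[0, 1, 2], [1, 2, 0]]
L = [[0, 1, 2], [2, 0, 1]]
print(is_latin_rectangle(J, 3), is_latin_rectangle(L, 3), are_orthogonal(L, J))
```

Counting distinct pairs is the same as requiring that each colour class of
$L$ meets each colour class of $J$ in at most one cell.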

Theorem 1.1. For every $\varepsilon > 0$ there is an $n_0 = n_0(\varepsilon)$
such that to any $m \times n$ latin rectangle $J$, with $n > n_0$ and
$m = n(1 - \varepsilon)$, there is an orthogonal companion $L$.

Remark 1.1. With some extra effort it is perhaps possible to prove Theo- 
rem 1.1 for all s = u! (n~^/^). However, it will be clear from the proof that 
in order to reach e < n~^/^ some new ideas must be found, if indeed the 
theorem is valid in this range. 

The basic method is related to nibble-methods used to colour graphs
having "near disjoint" cliques. An orthogonal companion $L$ of $J$ can be
thought of as an $n$-colouring of the graph where the $mn$ cells are vertices
and where each row, column and $J$-colourclass make up a clique, i.e. a
complete induced subgraph. The recent monograph of Reed and Molloy [9]
contains many results in this area, e.g. J. Kahn's result [7] on
edge-colourings of near-disjoint hypergraphs.

There are elements of this proof that we believe are new. First of all, we
have a distinguished parallel class of cliques of size $n$, corresponding to
the rows, that we colour with $n$ colours. In other words we have no slack
here.

In the proof, we construct an orthogonal companion in a random greedy
manner by adding one row at a time. We use a process, $q^t$, $t \in [0,m]$, of
"fractional latin rows" to guide the greedy extensions, so that
$q^t(t+1,\cdot,\cdot) \in \mathbb{R}^{\mathcal{K}\times\mathcal{S}}$ gives the
expectation of the row, $t + 1$, added at time $t$. We maintain the
"legality" of $q^t$ by setting $q^t(i,k,\gamma) = 0$ if the cell $(i,k)$
belongs to a $J$-colourclass or column already coloured with symbol $\gamma$.

By analysing the time evolution for certain statistics of the random
process $q^t$, we deduce that, with positive probability, $q^t$ can be legally
maintained for all times $t = 0, 1, \dots, m - 1$ so that the latin rectangle
$L$ is constructed at time $m$.

1.2 Rows, columns, cells, diagonals and points 

We will think of a vector $f$ in Cartesian space $\mathbb{R}^{A}$ as a
real-valued mapping $f$ from the index set $A$. Pointwise relations extend to
relations between vectors in the natural way, e.g. $f \le g$ means that
$f(a) \le g(a)$ for all $a \in A$.

Let $\mathcal{R} = [1,m] := \{1, 2, \dots, m\}$ denote the set of rows,
$\mathcal{K}$ denote the set of columns and $\mathcal{S}$ the set of symbols.
Thus, $|\mathcal{R}| = m = n(1 - \varepsilon)$ and
$|\mathcal{K}| = |\mathcal{S}| = n$. We refer to elements of
$\mathcal{R} \times \mathcal{K}$ as cells. Let $J$ be the given latin
rectangle from Theorem 1.1. A diagonal is a set of cells assigned a common
colour by $J$ and the family of diagonals is denoted by $\mathcal{D}$. The
elements of $\mathcal{X} := \mathcal{R} \times \mathcal{K} \times \mathcal{S}$
we refer to as points. If nothing else is stated, we assume that the variable
$i$ refers to a row, the variables $k$, $l$ refer to columns and a variable
$\gamma$ refers to a symbol. The variables $x$ and $y$ will be preferred for
points.

We have, for $\mathcal{P} = \mathcal{R}, \mathcal{K}, \mathcal{D}, \mathcal{S}$,
mappings $\iota_{\mathcal{P}} : \mathcal{X} \to \mathcal{P}$ assigning to each
point the unique row, column, diagonal or symbol to which it belongs. We
usually say that a point $x \in \mathcal{X}$ belongs to the corresponding row,
column, diagonal or symbol $a \in \mathcal{P}$, when
$\iota_{\mathcal{P}}(x) = a$.

A line is a set of points with two of these coordinates fixed, i.e. a line is
a set of the form
$(\iota_{\mathcal{P}} \times \iota_{\mathcal{Q}})^{-1}(\alpha, \beta)$ with
$\alpha \in \mathcal{P}$ and $\beta \in \mathcal{Q}$. We introduce for any
pair of two distinct coordinates $\mathcal{P}$ and $\mathcal{Q}$ the mapping
$\ell_{\mathcal{PQ}}$ assigning to each point the corresponding line to which
it belongs. More precisely, we let

$$\ell_{\mathcal{PQ}}(x) := (\iota_{\mathcal{P}} \times \iota_{\mathcal{Q}})^{-1}\big(\iota_{\mathcal{P}}(x), \iota_{\mathcal{Q}}(x)\big). \tag{1}$$

The collection of lines make up a linear hypergraph on the points
$\mathcal{X}$, i.e. for every pair of distinct points $x, y \in \mathcal{X}$
there is at most one line containing them.
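
The point and line structure can be explored on a small instance. The sketch
below is only an illustration under stated assumptions: the function names
`build_lines` and `is_linear` are hypothetical, rows, columns and symbols are
indexed from 0, and lines that coincide as point sets (for instance the
row-column and row-diagonal lines through the same cell) are counted once.

```python
from itertools import combinations, product


def build_lines(J):
    """Return all points and the set of lines (as frozensets of points) for J.

    A point is (row, column, symbol); its diagonal is J[row][column].
    A line fixes two of the four coordinates row/column/diagonal/symbol.
    """
    m, n = len(J), len(J[0])
    points = list(product(range(m), range(n), range(n)))
    lines = {}
    for (i, k, g) in points:
        coords = (i, k, J[i][k], g)  # (row, column, diagonal, symbol)
        for a, b in combinations(range(4), 2):
            lines.setdefault((a, b, coords[a], coords[b]), set()).add((i, k, g))
    # Lines that are equal as sets are merged by the set comprehension.
    return points, {frozenset(pts) for pts in lines.values()}


def is_linear(lines):
    """True if every pair of distinct points lies in at most one line."""
    seen_pairs = set()
    for line in lines:
        for pair in combinations(sorted(line), 2):
            if pair in seen_pairs:
                return False
            seen_pairs.add(pair)
    return True


# A cyclic 3 x 4 latin rectangle as a toy example.
J = [[(i + k) % 4 for k in range(4)] for i in range(3)]
points, lines = build_lines(J)
print(len(points), len(lines), is_linear(lines))
```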



1.2.1 Latin rectangles 

A rectangle can now be identified with a $\{0,1\}$-valued vector
$L \in \{0,1\}^{\mathcal{X}}$ in the obvious way: For a point
$x = (i,k,\gamma)$, $L(x) = 1$ if $\gamma$ is the symbol assigned to cell
$(i,k)$ and $L(x) = 0$ otherwise.

A rectangle $L \in \{0,1\}^{\mathcal{X}}$ is a latin rectangle orthogonal to
$J$ exactly when the relations

$$\sum_{y \in \ell_{\mathcal{RK}}(x)} L(y) = 1, \qquad \sum_{y \in \ell_{\mathcal{RS}}(x)} L(y) = 1, \tag{L}$$

$$\sum_{y \in \ell_{\mathcal{KS}}(x)} L(y) \le 1, \qquad \sum_{y \in \ell_{\mathcal{DS}}(x)} L(y) \le 1, \tag{C}$$

hold for all $x \in \mathcal{X}$. The relations in (L) and (C) define a
polytope $\mathfrak{L} \subset [0,1]^{\mathcal{X}}$ so that a latin rectangle
$L$ is a $\{0,1\}$-valued element of this polytope. For our purposes,
rational latin rectangles orthogonal to $J$ are vectors in $\mathfrak{L}$. The
constraints in (L) are local to each row since they concern lines contained in
rows. The constraints in (C) are then central constraints since they concern
lines transversal to the rows.



1.3 The greedy latin rectangle process 

We now give a bird's-eye view of the proof. The probabilistic terminology
used regarding vector-valued random processes is made precise in §1.4
below. Our purpose is to construct an increasing random process, a greedy
rectangle process, $L^t \in \{0,1\}^{\mathcal{X}}$ of partial $J$-orthogonal
latin rectangles that proceeds row-wise: Initially, $L^0 = 0$ and at each tick
of the clock, i.e. when $t \mapsto t + 1$, we extend (if the situation allows
it) the partial latin rectangle $L^t$ to a partial latin rectangle $L^{t+1}$
having the row $t + 1$ added to the latin rectangle. The time variable
$t \in [0,m] := \{0, \dots, m\}$ thus corresponds to rows being added to the
rectangle. The process is successful if $L^m$ actually produces a full
$J$-orthogonal latin rectangle.

It is quite easy to see that such a greedy rectangle process should always
be successful as long as $m \le n/4$. To see this, note that the legal choices
of the added row $t + 1$ are, by the local constraints, given by matchings
in a "legality graph", which is the balanced bipartite graph consisting of
those symbol-column pairs for row $t + 1$ that are not in conflict with any
previously added row due to central constraints. Moreover, each previously
added row can exclude at most two symbols for the column $k$ on row $t + 1$:
one symbol on the same column and one symbol on the same diagonal. Similarly,
each added row excludes at most two possible columns for any symbol $\gamma$.
Thus, since $t < m \le n/4$ rows are previously added, the legality graph will
have minimum degree at least $n - 2t > n/2$ and a well known degree condition
based on Hall's theorem ensures the existence of a legal matching for row
number $t + 1$.
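
The naive argument can be turned into a short program. The following Python
sketch adds one row at a time and finds a perfect matching in the legality
graph with a standard augmenting-path search; it is a deterministic
illustration of the degree argument only, not the random process studied in
this paper. The function name `greedy_orthogonal_mate` and the 0-based
indexing are assumptions of the example.

```python
def greedy_orthogonal_mate(J):
    """Row-by-row greedy construction of a latin rectangle L orthogonal to J.

    Returns L as a list of rows, or None if some row has no legal matching.
    """
    m, n = len(J), len(J[0])
    used_in_column = [set() for _ in range(n)]    # symbols of L already in column k
    used_on_diagonal = [set() for _ in range(n)]  # symbols of L already on J-diagonal d
    L = []
    for t in range(m):
        # Legality graph of the new row: column k is joined to symbol g unless
        # g already occurs in column k or on the diagonal J[t][k].
        allowed = [
            [g for g in range(n)
             if g not in used_in_column[k] and g not in used_on_diagonal[J[t][k]]]
            for k in range(n)
        ]
        column_of = [None] * n  # column_of[g] = column currently matched to symbol g

        def augment(k, seen):
            # Try to match column k, re-routing earlier columns if necessary.
            for g in allowed[k]:
                if g in seen:
                    continue
                seen.add(g)
                if column_of[g] is None or augment(column_of[g], seen):
                    column_of[g] = k
                    return True
            return False

        if not all(augment(k, set()) for k in range(n)):
            return None
        row = [None] * n
        for g, k in enumerate(column_of):
            row[k] = g
        L.append(row)
        for k, g in enumerate(row):
            used_in_column[k].add(g)
            used_on_diagonal[J[t][k]].add(g)
    return L


# A cyclic 3 x 12 latin rectangle, so that m <= n/4.
n, m = 12, 3
J = [[(i + k) % n for k in range(n)] for i in range(m)]
L = greedy_orthogonal_mate(J)
print(L is not None)
```

For $m \le n/4$ the degree condition guarantees that the search never fails;
for larger $m$ the function may return `None`, which is where the guided
random process takes over.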

This naive argument can be extended significantly when the obtained 
legality graph is sufficiently random-like to ensure the existence of a perfect 
matching for degrees well-below n/2. To achieve this, we need to introduce 
some probabilistic tools. 

A central idea is to let the greedy rectangle process $L^t$ be "guided" by a
Markov process $p^t \in [0,1]^{\mathcal{X}}$, $t \in [0,m]$. We refer to
$p^t \in [0,1]^{\mathcal{X}}$ as a state. The initial state is the uniform
vector $p^0 = \frac{1}{n}$. The relationship between the processes $L^t$ and
$p^t$ is that, at time $t$, $p^t(x)$ approximately gives the expectation of
$L^s(x)$, for points $x$ belonging to rows that are coloured at time $s > t$.

Care must therefore be taken in the construction of $p^t$, so that $L^t$ never
violates the local and central constraints, (L) and (C). We defer the exact
definition of $p^t$ to §1.5 below. We note here that the construction of $p^t$
ensures that the central constraints (C) are never violated by $L^t$: If a
cell $(t+1,k)$ in the active row is assigned the colour $\gamma$ at time $t$
then we remove the possibility that any cell in the same column or diagonal
later gets colour $\gamma$. Hence, we must "kill" all points $y$ belonging to
the central lines going through the point $(t+1,k,\gamma)$, that is, we set

$$p^{t+1}(y) = p^{t+2}(y) = \dots = 0, \tag{2}$$

for all $y$ belonging to such a central line.

Given $t \in [0,m]$, we define a region $\Gamma \subset [0,1]^{\mathcal{X}}$,
where $p^t \in \Gamma$ should be interpreted as stating that $p^t$ is a "good
state". The exact definition of $\Gamma$ is deferred to §1.6 below, but we
mention that $\Gamma$ is defined by three collections of inequalities: The
first group of inequalities, $(A_x)$, bounds the size of the individual values
$p^t(x)$ while the second group, $(B_{\ell(x)})$, states that $p^t$ almost
should satisfy the local constraints (L).

Note that, for a fixed row $i \in \mathcal{R}$, the local constraints given by
(L) define a polytope $\mathfrak{L}_i$ in
$\mathbb{R}^{\mathcal{K} \times \mathcal{S}}$ which can be interpreted as the
polytope of rational (perfect) matchings in the complete bipartite graph
$K(\mathcal{K}, \mathcal{S})$ and we refer to $\{0,1\}$-valued vectors in
$\mathfrak{L}_i$ as matchings on that row.

Lemma 1.2. If $p^t \in \Gamma$ then, for each row $i$ there is a rational
matching $q_i^t \in \mathfrak{L}_i$ such that for all
$(k,\gamma) \in \mathcal{K} \times \mathcal{S}$

$$q_i^t(i,k,\gamma) \le (1 + \mathsf{b})\, p^t(i,k,\gamma), \tag{3}$$

where $\mathsf{b}$ is an abbreviation of the asymptotic expression
$O\big((n^{-1/2} \log n)^{1/2}\big)$. (See (6) below.)

We prove this Lemma in §2 using the Ford-Fulkerson Theorem;
in the proof, the third and last group of inequalities, $(C_{i,kl})$, which
give a "quasi-random" property of $p^t$, are central for the construction.

Now recall the well-known characterization by Birkhoff [2] stating that
any rational matching $q_i \in \mathfrak{L}_i$ can be expressed as a convex
combination $q_i = \sum_{M} c_M M$ of matchings $M \in \mathfrak{L}_i$. By
interpreting the convex coefficients $c_M$ as probabilities, where we pick the
matching $M$ with probability $c_M$, Birkhoff's theorem can also be given the
following formulation: Given any rational matching $q_i \in \mathfrak{L}_i$ it
is always possible to find a random matching $L_i \in \mathfrak{L}_i$, such
that the expectation of $L_i$ equals $q_i$.

Therefore, modulo the precise definition of $\Gamma$ and $p^t$, we can define
the greedy latin rectangle process $L^t$ by iterating the following procedure
for $t = 0, 1, \dots, m - 1$.

Extend If $p^t \in \Gamma$ then choose a rational matching $q^t_{t+1}$ on row
$t + 1$ which satisfies (3). Then draw, using the random construction implied
by Birkhoff's theorem, an extension $L^{t+1}$ such that

$$\mathbb{E}_t\big[L^{t+1}(t+1,k,\gamma)\big] = q^t_{t+1}(t+1,k,\gamma). \tag{4}$$

The new state $p^{t+1}$ is then constructed from $p^t$, $q^t_{t+1}$ and
$L^{t+1}$ according to the construction in (8) below.

Stop If $p^t \notin \Gamma$ then simply let $L^s = L^t$ and $p^s = p^t$ for
all $s$, $t < s \le m$. The greedy latin rectangle is then said to be
unsuccessful.

On account of the killing mechanism (2), the bound (3) and the property (4),
the construction ensures that the central constraints are never violated by
$L^t$. Thus, if $p^t$ stays in $\Gamma$, the process produces an orthogonal
companion $L^m$ to $J$ at time $m$ and the probability of an unsuccessful
rectangle process is the probability that $p^t$ leaves $\Gamma$ for some
$t \in [0,m]$. The proof is thus concluded if we, after properly describing
the construction of $p^t$ and the definition of $\Gamma$ and proving
Lemma 1.2, in addition prove the following lemma.

Lemma 1.3. For all $t = 1, 2, \dots, m - 1$ we have that

$$\mathbb{P}\{p^t \in \Gamma\} = 1 - n^{-\omega(1)}. \tag{5}$$

Note that (5) implies that the probability that $p^t$ stays inside $\Gamma$ at
all times $t$ is of order $1 - m\, n^{-\omega(1)} = 1 - n^{-\omega(1)}$.
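
The random draw in the Extend step rests on Birkhoff's characterization. As a
hedged illustration, the sketch below decomposes a doubly stochastic matrix
with rational entries into permutations and then samples one of them with the
corresponding probabilities. The function names, the use of exact `Fraction`
arithmetic and the particular greedy peeling strategy are assumptions of this
example; the paper only uses the existence of such a decomposition.

```python
from fractions import Fraction
import random


def perfect_matching_in_support(q):
    """Augmenting-path search for a perfect matching among the positive entries of q."""
    n = len(q)
    column_of = [None] * n  # column_of[g] = row index k matched to symbol g

    def augment(k, seen):
        for g in range(n):
            if q[k][g] > 0 and g not in seen:
                seen.add(g)
                if column_of[g] is None or augment(column_of[g], seen):
                    column_of[g] = k
                    return True
        return False

    for k in range(n):
        if not augment(k, set()):
            raise ValueError("matrix does not have equal row and column sums")
    perm = [None] * n
    for g, k in enumerate(column_of):
        perm[k] = g
    return perm  # perm[k] = symbol matched to column k


def birkhoff_decomposition(q):
    """Write a doubly stochastic matrix as a list of (coefficient, permutation)."""
    q = [[Fraction(v) for v in row] for row in q]  # work on an exact copy
    n = len(q)
    parts = []
    while any(v > 0 for row in q for v in row):
        perm = perfect_matching_in_support(q)
        c = min(q[k][perm[k]] for k in range(n))
        for k in range(n):
            q[k][perm[k]] -= c  # at least one entry becomes zero
        parts.append((c, perm))
    return parts


def sample_matching(parts, rng=random):
    """Pick a permutation M with probability equal to its coefficient c_M."""
    weights = [float(c) for c, _ in parts]
    return rng.choices([perm for _, perm in parts], weights=weights)[0]


q_row = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
]
parts = birkhoff_decomposition(q_row)
print(parts, sample_matching(parts))
```

By construction, the expectation of the indicator matrix of the sampled
permutation equals `q_row`, which is the property asked for in (4).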

1.4 Probabilistic preliminaries and asymptotic notation

1.4.1 Asymptotic notation

We will use the standard asymptotic notation, $O(\cdot)$, $o(\cdot)$,
$\omega(\cdot)$, $\Omega(\cdot)$, etc., where all are interpreted as
asymptotic estimates relative to the limit $n \to \infty$. That is,
$f = O(g)$ if and only if $\limsup_{n\to\infty} |f/g| < \infty$,
$f = \omega(g)$ if and only if $\liminf_{n\to\infty} |f/g| = \infty$,
$f = \Omega(g)$ if and only if $\limsup_{n\to\infty} |g/f| < \infty$,
$f = o(g)$ if and only if $\limsup_{n\to\infty} |f/g| = 0$ and
$f = \Theta(g)$ if and only if
$\limsup_{n\to\infty} (|f/g| + |g/f|) < \infty$.

Such asymptotic expressions are often used to estimate the components of
vectors and, of course, if we have such a local quantity expressed in terms of
some asymptotic expression, then all implicit constants in the asymptotic
expression are assumed to be independent of the particular point, row, column
et cetera at which the local quantity is defined.

Since some asymptotic expressions are extensively used, we also introduce
the following abbreviations of asymptotic expressions

$$\mathsf{a} := O\Big(\frac{\log n}{\sqrt{n}}\Big), \qquad \mathsf{b} := O\big((n^{-1/2} \log n)^{1/2}\big), \qquad \mathsf{p} := O\Big(\frac{1}{n}\Big). \tag{6}$$

Note that $\sqrt{\mathsf{a}} = \mathsf{b}$.

1.4.2 Probabilistic terminology and notation 

The proof will use a dynamic probabilistic method, so we introduce some
concepts and terms from probability theory. We will construct a filtered
probability space $(\Omega, \mathbb{P}\{\cdot\}, \mathcal{F}^t)$ with a
discrete time-variable $t$ taking values in $[0,m] := \{0, 1, \dots, m\}$ and
we can assume that the probability space we work with is finite. The finite
algebras $\{\mathcal{F}^t\}$, $t \in [0,m]$, form an increasing sequence of
subsets of $2^{\Omega}$ with $\mathcal{F}^0 = \{\emptyset, \Omega\}$. In our
case, $\mathcal{F}^t$ captures the random operations used for adding the first
$t$ rows to a latin rectangle. A random variable is determined at time $t$ if
it is $\mathcal{F}^t$-measurable.

We will work with vector-valued random variables without explicitly noting
this: A (vector-valued) random variable is a mapping
$X : \Omega \to \mathbb{R}^{A}$ from $\Omega$ to a Cartesian space
$\mathbb{R}^{A}$. For our purposes, a stochastic process is a function
$X : \Omega \times [0,m] \to \mathbb{R}^{A}$. We write $X(\omega, t)$ simply
as $X^t$. Functions defined for $t \in [0,m]$ will generally have the variable
$t$ as a superscript. We suppress the dependence on $\omega \in \Omega$ for
the random entities.

A process $X^t$ is adapted if the value of $X^t$ is determined at time $t$ and
we usually assume this to be the case without further notice. A process $X^t$
is previsible if the value of $X^t$ is determined at time $t - 1$.

The expectation operator $\mathbb{E}[\cdot]$ and the probability
$\mathbb{P}\{\cdot\}$ refer to the unconditional probability, while the
temporal conditional expectation is denoted by
$\mathbb{E}_t[f] = \mathbb{E}[f \mid \mathcal{F}^t]$ and the conditional
probability by $\mathbb{P}_t\{A\} = \mathbb{P}\{A \mid \mathcal{F}^t\}$.

The various expectation operators apply to vectors so that if $F$ is a random
vector taking values in a Cartesian space $\mathbb{R}^{A}$ we mean by
$\mathbb{E}[F]$ the vector in $\mathbb{R}^{A}$ given by
$\mathbb{E}[F](a) = \mathbb{E}[F(a)]$, $a \in A$.

An adapted process $X^t$ is a martingale (supermartingale or submartingale)
if $\mathbb{E}_t[X^{t+1}] = X^t$ ($\mathbb{E}_t[X^{t+1}] \le X^t$ or
$\mathbb{E}_t[X^{t+1}] \ge X^t$).

A stopping time is a random time $\tau : \Omega \to [0,m] \cup \{\infty\}$
such that the event $\{\tau \le t\} \in \mathcal{F}^t$. We will work with
vectors of random times, i.e., mappings
$\tau : \Omega \times A \to [0,m] \cup \{\infty\}$.

Given a vector-valued process $X^t \in \mathbb{R}^{A}$, $t \in [0,m]$, then
$X^{t \wedge \tau}$ is the process whose value at $a \in A$ at time $t$ is
$X^t(a)$ if $t \le \tau(a)$ and $X^{\tau(a)}(a)$ otherwise. (We use
$s \wedge t$ as a shorthand for $\min\{s, t\}$.)

If $X^t$ is adapted and $\tau$ is a vector of stopping times then
$X^{t \wedge \tau}$ is adapted. We say that an adapted process $X^t$ is
stopped at a vector of stopping times $\tau$ if $X^t = X^{t \wedge \tau}$. If
$X^t$ is a supermartingale and $\tau$ is a vector of stopping times then the
process $X^{t \wedge \tau}$ is a supermartingale.

Before proving Lemma 1.2 and Lemma 1.3 in the following sections, we
first proceed to define the process $p^t$ rigorously as well as the good set
$\Gamma$.

1.5 The construction of $p^t$

For ease of notation, we define a deterministic vector
$\tau_c : \mathcal{X} \to [0,m]$ of colouring times, so that, for a point
$x = (i,k,\gamma)$, $\tau_c(x) = i - 1$ is the time when the row $i$ is added
to the latin rectangle process.

1.5.1 The killing

Consider a point $x = (i,k,\gamma) \in \mathcal{X}$ such that
$\tau_c(x) > t$. For a central line $\ell = \ell_{\mathcal{KS}}$ or
$\ell = \ell_{\mathcal{DS}}$, let $\ell^t(x)$ be the unique point on the
active row $t + 1$ lying in the line $\ell(x)$, $x \in \mathcal{X}$. If any of
$L^{t+1} \circ \ell^t_{\mathcal{KS}}(x)$ and
$L^{t+1} \circ \ell^t_{\mathcal{DS}}(x)$ takes the value 1 then the point $x$
is killed at time $t$.

The two "projections" of $x$, $\ell^t_{\mathcal{KS}}(x)$ and
$\ell^t_{\mathcal{DS}}(x)$, are distinct points and belong to a common local
line
$\ell_{\mathcal{RS}}(\ell^t_{\mathcal{KS}}(x)) = \ell_{\mathcal{RS}}(\ell^t_{\mathcal{DS}}(x))$
on row $t + 1$. The indicators $L^{t+1} \circ \ell^t_{\mathcal{KS}}(x)$ and
$L^{t+1} \circ \ell^t_{\mathcal{DS}}(x)$ can therefore never take the value 1
simultaneously and we can write

$$K^{t+1}(x) = L^{t+1} \circ \ell^t_{\mathcal{KS}}(x) + L^{t+1} \circ \ell^t_{\mathcal{DS}}(x), \tag{7}$$

for the indicator $K^{t+1}(x) \in \{0,1\}$ of the event that the point $x$ is
killed.

1.5.2 The construction of $p^t$

Initially, we set $p^0(x) = \frac{1}{n}$ for all $x \in \mathcal{X}$. We
define the global stopping time $T$ marking exit from $\Gamma$, i.e.

$$T := \min\big(\{t \in [0, m-1] : p^t \notin \Gamma\} \cup \{\infty\}\big).$$

For $t \ge T$ the greedy process is thus in effect "stopped" and the greedy
random colouring has failed if $T < \infty$.

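Therefore, for $t < T$ and $x \in \mathcal{X}$ such that $\tau_c(x) > t$,
define

$$p^{t+1}(x) := p^t(x)\, \frac{1 - K^{t+1}(x)}{1 - \mathbb{E}_t[K^{t+1}(x)]}, \tag{8}$$

and for $t < T$ and $t \ge \tau_c(x)$, let $p^{t+1}(x) = p^t(x)$.

By the definition of $p^{t+1}$ above, the process $p^t$ is stopped both at the
global stopping time $T$ and at the deterministic stopping time vector
$\tau_c : \mathcal{X} \to [0,m]$, i.e.,
$p^t = p^{t \wedge \tau_c \wedge T}$.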

1.5.3 Some properties of $p^t$

For $t < T$, i.e. as long as $p^t \in \Gamma$, we can, by Definition 1.1
below, assume that

$$p^t \le \mathsf{p}, \tag{9}$$

with the asymptotic abbreviation $\mathsf{p} = O(1/n)$ as in (6). (We
understand that a relation like (9) holds with the same implicit constant at
all points $x \in \mathcal{X}$.)

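Since the vector $1 - K^{t+1} \in \{0,1\}^{\mathcal{X}}$ indicates survival,
the definition ensures that all points $x \in \mathcal{X}$ such that
$\tau_c(x) \ge t + 1$ and $p^{t+1}(x) > 0$ can be used to extend $L^{t+1}$. It
follows that $L^t$ is indeed a process of legal (partial) $J$-orthogonal latin
rectangles.

Note that, from (3), (7) and (9), we have that

$$\begin{aligned}
\mathbb{E}_t\big[1 - K^{t+1}(x)\big] &\ge 1 - (1 + \mathsf{b})\big(p^t \circ \ell^t_{\mathcal{KS}}(x) + p^t \circ \ell^t_{\mathcal{DS}}(x)\big) \\
&= 1 - p^t \circ \ell^t_{\mathcal{KS}}(x) - p^t \circ \ell^t_{\mathcal{DS}}(x) - \mathsf{b}\,\mathsf{p} \\
&= 1 + \mathsf{p}.
\end{aligned} \tag{10}$$

Hence

$$p^t(x) \le p^{t+1}(x) \le p^t(x) \cdot (1 + \mathsf{p}), \tag{11}$$

unless $x$ is killed, i.e. unless $p^{t+1}(x) = 0$.

By the construction (8), the process $p^t$ is a martingale, so that

$$\mathbb{E}_t[p^{t+1}] = p^t. \tag{12}$$

Relations (11) and (12) will be essential in deriving the concentration
results upon which the proof is founded.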

1.6 The definition of $\Gamma$

The process $p^t$ is controlled by keeping a set of local inequalities alive
through the iterations. These local relations make up the notion of
"goodness" that we rely on throughout the arguments. First, for
$x = (i,k,\gamma) \in \mathcal{X}$, let the inequality $(A_x)$ be defined by

$$p^t(x) \le 1.1\, \frac{\varepsilon^{-2}}{n}. \tag{$A_x$}$$

Secondly, for a local line
$\ell = \ell_{\mathcal{RK}}, \ell_{\mathcal{RS}}$ and $x \in \mathcal{X}$, we
say that $(B_{\ell(x)})$ holds if

Finally, for $k \ne l \in \mathcal{K}$ and $i \in \mathcal{R}$ let
$(C_{i,kl})$ be the "quasi-random" inequality

?\h k, 7) P\i, 7) < - (1 + ^)- 

7 

Definition 1.1. We say that the state $p^t \in [0,1]^{\mathcal{X}}$ is good if
$(A_x)$, $(B_{\ell_{\mathcal{RK}}(x)})$ and $(B_{\ell_{\mathcal{RS}}(x)})$
hold at all $x \in \mathcal{X}$ and, in addition, $(C_{i,kl})$ holds for all
$k, l \in \mathcal{K}$ and $i \in \mathcal{R}$. We write $\Gamma$ for the
region $\Gamma \subset [0,1]^{\mathcal{X}}$ of good states.

It is trivial to check that the initial state, $p^0 = 1/n$, is an element of
$\Gamma$.

1.6.1 Asymptotic relaxations

The precise formulations of $(A_x)$, $(B_{\ell(x)})$ and $(C_{i,kl})$ are
needed to make $\Gamma$ well defined, but that precision is otherwise not
critical. What we actually will use in the computations below are the
following less precise asymptotic statements

$$p^t(x) = \mathsf{p}, \tag{A}$$

$$\sum_{y \in \ell(x)} p^t(y) = 1 + \mathsf{a}, \tag{B}$$

and

$$\sum_{\gamma} p^t(i,k,\gamma)\, p^t(i,l,\gamma) \le (1 + \mathsf{a})\, \frac{1}{n}. \tag{C}$$

It should be noted that the states $p^t$, if they eventually leave $\Gamma$ at
the time $T$, will stay quite close to $\Gamma$. This is due to the fact that
we stop $p^t$ at time $t = T$, with the previous state $p^{T-1}$ being a good
state. As a consequence, we can use the somewhat relaxed bounds (A), (B) and
(C) in our arguments, without considering if we are conditioning on $t < T$ or
not, since these bounds then hold for all times $t$. In particular, we can
assume the relaxed bounds when we later show that $p^t$ with high probability
stays inside $\Gamma$. (For definiteness, one may choose to replace the
implicit constants in (A), (B) and (C) with explicit constants slightly larger
or smaller than those used in the definition of $\Gamma$.)

In order to see that $p^T$ is close to $\Gamma$ in this sense, note that from
(11) it is clear that the left hand side of (A) can only increase by a factor
$1 + \mathsf{p}$ at a time. Similarly, the left hand side of (C) can at most
increase by a factor $(1 + \mathsf{p})^2 = 1 + \mathsf{a}$ at a time. Finally,
from the fact that at most two terms (see argument preceding (35) below) in
the sums in (B) can be killed at a time, it also follows that the left hand
side of (B) can only change by a factor $1 \pm \mathsf{p}$ at a time.

2 The proof of Lemma 1.2 

The reason behind introducing the set $\Gamma$ is Lemma 1.2 stated in the
introduction. This lemma shows the existence of $q_i^t$ and hence lets us
define the extension $L^{t+1}$ of $L^t$ as long as $p^t \in \Gamma$. We now
proceed to prove this lemma.

We only have to look at one fixed row $i \in \mathcal{R}$ at a time. Given a
good state $p^t \in \Gamma$, let
$p \in [0,1]^{\mathcal{K} \times \mathcal{S}}$ be given by

$$p(k,\gamma) := p^t(i,k,\gamma) \Big/ \sum_{l} p^t(i,l,\gamma), \tag{13}$$

so that $p = (1 \pm \mathsf{a})\, p^t$ by (B) and $p$ is normalized at each
symbol, i.e.

$$\sum_{k} p(k,\gamma) = 1. \tag{14}$$

We assume that $p^t$ satisfies the inequalities (A), (B) and (C), which
translates to the following set of inequalities

$$p(k,\gamma) = \mathsf{p}, \tag{15}$$

$$\sum_{\gamma \in \mathcal{S}} p(k,\gamma) = 1 + \mathsf{a}, \tag{16}$$

$$\sum_{\gamma \in \mathcal{S}} p(k,\gamma)\, p(l,\gamma) \le (1 + \mathsf{a})/n, \tag{17}$$

for all values of $k, l \in \mathcal{K}$ and $\gamma \in \mathcal{S}$.

We want to show that for some $\eta = \mathsf{b}$ there exists a rational
matching, i.e. a vector $q \in [0,1]^{\mathcal{K} \times \mathcal{S}}$ such
that for all $k$ and $\gamma$,
$\sum_{k'} q(k',\gamma) = \sum_{\gamma'} q(k,\gamma') = 1$, that in addition
satisfies

$$q(k,\gamma) \le (1 + \eta) \cdot p(k,\gamma), \qquad \forall (k,\gamma) \in \mathcal{K} \times \mathcal{S}.$$

Such a rational matching can also be defined as a flow on a directed graph
from a source $s$ connected to all vertices $k \in \mathcal{K}$ to a sink $t$
connected to all vertices $\gamma \in \mathcal{S}$. The flow should take the
value 1 on each edge $sk$, $k \in \mathcal{K}$, and each edge $\gamma t$,
$\gamma \in \mathcal{S}$. On the remaining edges, of the form $k\gamma$, we
prescribe the capacities $(1 + \eta)\, p(k,\gamma)$. The Ford-Fulkerson
theorem says that it is enough to show that for all pairs of nonempty sets
$A \subset \mathcal{S}$ and $B \subset \mathcal{K}$ we have

$$2n - |A| - |B| + (1 + \eta) \sum_{\gamma \in A,\ k \in B} p(k,\gamma) \ge n, \tag{18}$$

since the left hand side is the capacity for a cut in a flow defining a
rational matching where the capacity of edge $k\gamma$ equals
$(1 + \eta) \cdot p(k,\gamma)$.

Given an arbitrary pair of subsets $A$ and $B$ as above, we shall prove that,
for some $\eta = \mathsf{b}$ (with the implicit constant independent of $A$
and $B$), the left hand side in (18) is not strictly less than $n$ if we
assume that $p$ satisfies the inequalities (15), (16) and (17).

Hence, assume that (18) does not hold and proceed to derive a contradiction.
It must obviously then be the case that $|A| + |B| > n$ and both $A$ and $B$
must be non-empty. Let $a := |A|/n$, $b := |B|/n$. Then, $a, b > 0$ and

$$a + b > 1. \tag{19}$$

For $\gamma \in \mathcal{S}$ and $S \subset \mathcal{K}$, let

$$p(S,\gamma) := \sum_{k \in S} p(k,\gamma)$$

and define

$$x := b - \frac{1}{|A|} \sum_{\gamma \in A} p(B,\gamma), \qquad y := b - \frac{1}{n - |A|} \sum_{\gamma \in \mathcal{S} \setminus A} p(B,\gamma). \tag{20}$$

It is easy to see that, by dividing both sides in (18) by $n$ and substituting
for $a$, $b$ and $x$, the assumption that (18) does not hold is equivalent to

$$(1 - a)(1 - b) + ab\,\eta - ax(1 + \eta) < 0. \tag{21}$$

In order to contradict (21), it is enough to show that

$$ax \le ab\,\mathsf{b}, \tag{22}$$

since we can take $\eta$ equal to, say, two times the $\mathsf{b}$-function on
the right hand side of (22).

We claim it follows from (19) and (15)-(17) that

$$|ax + (1 - a)y| \le ab \cdot \mathsf{a}, \tag{23}$$

and that

$$ax^2 + (1 - a)y^2 \le ab \cdot \mathsf{a}. \tag{24}$$

Postponing the proofs for the two relations (23) and (24) until later, we
proceed to show that they imply (22). We divide into two cases depending
on the value of $a$: If $a < 1/2$ then (24) gives

$$ax^2 \le ab\,\mathsf{a} = ab^2\,\mathsf{a},$$

where we use that (19) implies that $b > 1/2$ so that
$\mathsf{a} = b\,\mathsf{a}$. Multiplying both sides with $a$ and taking the
square root gives (22) since $\sqrt{\mathsf{a}} = \mathsf{b}$.

In the case $a \ge 1/2$, (24) gives that

$$(1 - a)y^2 \le ab\,\mathsf{a}$$

and since $1 - a < b$ by (19) we can multiply the left hand side by $1 - a$
and the right hand side by $b$. This gives

$$(1 - a)^2 y^2 \le ab^2\,\mathsf{a} \le a^2 b^2\,\mathsf{a},$$

where the last inequality is due to $a \ge 1/2$. Taking the square root of
this, substituting in (23) and using the triangle inequality we get

$$a|x| \le ab\,\mathsf{a} + (1 - a)|y| \le ab\,\mathsf{a} + ab\,\mathsf{b} = ab\,\mathsf{b}$$

and (22) is proved. □

(It follows that we can take $\eta$ to be four times the square root of the
maximum $\mathsf{a}$-function found in (23) and (24).)

What remains now is to show that (23) and (24) follow from (19) and the
goodness assumptions (15), (16) and (17). Note that, by (19),

$$ab\,\mathsf{a} \ge (1 - b)\,b\,\mathsf{a} \ge (1/2) \min\{b, 1 - b\}\,\mathsf{a} = \min\{b, 1 - b\}\,\mathsf{a}.$$

Consequently, it is enough to show (23) and (24) for the right hand side
$\beta\,\mathsf{a}$, $\beta \in \{b, 1 - b\}$, instead of $ab\,\mathsf{a}$.

Moreover, since for each $\gamma$,
$p(B,\gamma) = 1 - p(\mathcal{K} \setminus B, \gamma)$, the values $x$ and $y$
will both change only in sign if we interchange $B$ with
$\mathcal{K} \setminus B$ in (20). Thus, if we do not use the assumption (21)
(and e.g. (19)) about $B$, we may freely interchange $B$ and
$\mathcal{K} \setminus B$. This means that we only have to consider the case
$\beta = b$.

We first show that $|ax + (1 - a)y| \le b\,\mathsf{a}$: We have that
$|A|(b - x) + (n - |A|)(b - y)$ equals $\sum_{\gamma} p(B,\gamma)$ which, by
(16), is of order $|B|(1 + \mathsf{a})$. Dividing by $n$ gives

$$a(b - x) + (1 - a)(b - y) = b + b\,\mathsf{a} \iff |ax + (1 - a)y| = b\,\mathsf{a}.$$

In order to see that $ax^2 + (1 - a)y^2 \le b\,\mathsf{a}$, we let
$z_{kl} = \sum_{\gamma \in \mathcal{S}} p(k,\gamma) \cdot p(l,\gamma)$ denote
the left hand side of (17) above. Note that,

$$\sum_{\gamma} p(B,\gamma)^2 = \sum_{k,l \in B,\ k \ne l} z_{kl} + \sum_{\gamma} \sum_{k \in B} p^2(k,\gamma). \tag{25}$$

The last sum in (25) above is of the order $|B|\,\mathsf{p}$ by (15), and each
term $z_{kl}$ of the first sum on the right hand side is at most
$(1 + \mathsf{a})/n$ by (17), so that this sum is at most
$|B|^2 (1 + \mathsf{a})/n$. Furthermore, the Cauchy-Schwarz inequality implies
that the left hand side of (25) is greater than

$$|A| \Big(\frac{1}{|A|} \sum_{\gamma \in A} p(B,\gamma)\Big)^2 + (n - |A|) \Big(\frac{1}{n - |A|} \sum_{\gamma \notin A} p(B,\gamma)\Big)^2 = |A|(b - x)^2 + (n - |A|)(b - y)^2.$$

Thus, we have

$$|A|(b - x)^2 + (n - |A|)(b - y)^2 \le |B|^2 (1 + \mathsf{a})/n + |B|\,\mathsf{p}$$

and dividing by $n$ and expanding the squares gives the relation

$$ab^2 + (1 - a)b^2 - 2b\,(ax + (1 - a)y) + ax^2 + (1 - a)y^2 \le b^2 + b^2\,\mathsf{a} + b\,\mathsf{a},$$

which is equivalent to

$$ax^2 + (1 - a)y^2 \le b^2\,\mathsf{a} + b\,\mathsf{a} + 2b\,(ax + (1 - a)y).$$

The sought statement follows if we use (23) to estimate the last term. □
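
The flow formulation used in this proof can be tried out numerically. The
sketch below builds the network source -> columns -> symbols -> sink with
capacities $(1 + \eta)\,p(k,\gamma)$ on the middle edges and runs a plain
Edmonds-Karp max-flow with exact fractions. It is an illustration only: the
function name `rational_matching`, the dense capacity matrix and the choice of
Edmonds-Karp are assumptions of this example, not part of the argument above.

```python
from collections import deque
from fractions import Fraction


def rational_matching(p, eta):
    """Look for q with unit row and column sums and q[k][g] <= (1 + eta) * p[k][g].

    Returns q as a list of lists of Fractions, or None if the maximum flow is
    smaller than n (some cut violates the Ford-Fulkerson condition).
    """
    n = len(p)
    source, sink = 2 * n, 2 * n + 1  # columns are 0..n-1, symbols are n..2n-1
    size = 2 * n + 2
    cap = [[Fraction(0)] * size for _ in range(size)]
    for k in range(n):
        cap[source][k] = Fraction(1)
        cap[n + k][sink] = Fraction(1)
        for g in range(n):
            cap[k][n + g] = (1 + Fraction(eta)) * Fraction(p[k][g])
    residual = [row[:] for row in cap]
    total = Fraction(0)
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in range(size):
                if v not in parent and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push
    if total != n:
        return None
    # Flow on a middle edge = capacity minus remaining residual capacity.
    return [[cap[k][n + g] - residual[k][n + g] for g in range(n)] for k in range(n)]


p_example = [[Fraction(3, 4), Fraction(1, 4)], [Fraction(1, 4), Fraction(3, 4)]]
q_example = rational_matching(p_example, Fraction(1, 3))
print(q_example)
```

A return value of `None` corresponds to a pair $(A, B)$ for which (18) fails
for the chosen $\eta$.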

3 The proof of Lemma 1.3 

For the parameter $n$ tending to infinity, we stipulate that an event has very
high probability if it holds with probability having asymptotic order
$1 - n^{-\omega(1)}$. An inequality of the form $X^t \le f(t)$ that, for all
$t \in [0,m]$, holds with very high probability is said to be stable. Note
that, since we only consider $n^{O(1)}$ different times $t$, a stable
inequality holds with very high probability simultaneously for all
$t \in [0,m]$.

In order to conclude the proof of the lemma, it suffices to show that, for
every possible value of $x$, $i$, $k$, $l$, $\gamma$ and $\ell$,

the inequalities $(A_x)$, $(B_{\ell(x)})$ and $(C_{i,kl})$ defining $\Gamma$ are all stable. (26)

Since the definition of $\Gamma$ considers $O(n^3)$ such inequalities, the
probability that $p^t \notin \Gamma$ is then shown to be of order
$O(n^{-\omega(1) + 3}) = n^{-\omega(1)}$.

In the first subsection, we state and prove a more general "concentration
lemma", Lemma 3.1, and then we prove, in three separate subsections, the
stability of $(B_{\ell(x)})$, $(C_{i,kl})$ and $(A_x)$, where we regularly
invoke Lemma 3.1. Of the three, proving the stability of $(A_x)$ involves the
most complex argument, but it should be noted that analysing the structure of
the linear hypergraph given by the lines is also an essential component in
deriving the other two statements.

Again it should be noted, as in §1.6, that, since we stop $p^t$ at time $T$,
we can use the bounds (A), (B) and (C) in our arguments to derive the
stability of $(B_{\ell(x)})$, $(C_{i,kl})$ and $(A_x)$, without considering if
we are conditioning on $t < T$ or not. It should be clear that we at no point
make the assumption that the process $p^t$ is unstopped. In particular, the
conditions and conclusions of Lemma 3.1 work for stopped processes as well as
"live" processes.

3.1 A concentration result

The following lemma is a consequence of the Azuma-Hoeffding inequality, see
e.g. [1]. In order to make the subsequent invocations transparent, we put the
lemma in a suitable form and make no attempt to derive the best possible
result.

Lemma 3.1. Let $\xi$ and $\alpha^0, \alpha^1, \dots, \alpha^{m-1}$ be positive
numbers such that $(\xi\sqrt{n} + \alpha) \log n = o(1)$, where
$\alpha := \sum_{t=0}^{m-1} \alpha^t$. Let $X^t \ge 0$, $t \in [0,m]$, be a
positive process such that

$$\big|X^{t+1} - \mathbb{E}_t[X^{t+1}]\big| \le \xi \max\{X^t, X^0\} \tag{27}$$

and

$$\mathbb{E}_t[X^{t+1}] - X^t \le \alpha^t \max\{X^t, X^0\}. \tag{28}$$

Then, for all $t \in [0,m]$,

$$\mathbb{P}\{X^t \le (1 + \Phi) X^0\} = 1 - n^{-\omega(1)}, \tag{29}$$

for any $\Phi = \Theta\big((\xi\sqrt{n} + \alpha) \log n\big)$. Furthermore,
if $X^t$ is a martingale (in which case $\alpha = 0$) then the reverse
inequality

$$X^t \ge (1 - \Phi) X^0 \tag{30}$$

is stable as well.

Proof of Lemma 3.1 . Note that the inequalities (27) and (28) are still valid 
if we stop the process at any stopping time r. If we take r to be the first 
time that X* > 2 • X^ then, since 

X'+^ < (1 + C + a") X" = (1 + o (1)) X' 

we can assume that the stopped process X* = X*^^ < 3X°. It is therefore 
enough to prove the stability of (29) with the additional assumption that 
X* < 3 • XO for all t. 

Consider the martingale M* := X]l=o {^^^^ " IEs[X''+^]) , and the previs- 
ible process := X)t=I) (^4^'+^] - X') . The following identity 

X* = X° + A* + M* (31) 

is called the Doob decomposition of X*. We refer to the terms of A* as drifts 
of X* and to the terms in M* as deviations. 
On account of the bound $X^t \le 3X^0$, we obtain from (28) that $A^t \le \alpha \cdot 3X^0$
and, from (27), that $|M^{t+1} - \mathbb{E}_t[M^{t+1}]| \le \xi \cdot 3X^0$. Since $M^t$ is a martingale,
the Azuma-Hoeffding inequality implies that

\[ \mathbb{P}\{|M^t| \ge \lambda\} \le \exp\{-\lambda^2 / 4(\xi\sqrt{n}\, 3X^0)^2\} \]

if

\[ \lambda = \Theta(\xi\sqrt{n}(\log n)) \cdot X^0. \]

It is then also seen that $|M^t| \ge \lambda$ is an event of probability of order $n^{-\omega(1)}$.
Hence, the probability that $X^t - X^0 \ge \delta X^0$, where

\[ \delta X^0 = \Omega(\alpha \log n) X^0 + \Omega(\xi\sqrt{n} \log n) X^0, \]

is of order $n^{-\omega(1)}$; in the display above the first term is greater than $A^t \le
3\alpha X^0$ and the second term bounds $M^t$ with very high probability. This
proves (29). The stability of (30) follows in the same manner from the
stability of the event $M^t \ge -\lambda$. (In this case $A^t = 0$, since $X$ is a martingale.)

□ 
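The Doob decomposition (31) can be checked numerically on a toy process. The following is a minimal sketch, not part of the argument: the process (a multiplicative walk with a fixed relative drift and a fair-coin relative deviation) and the names `simulate` and `doob_decomposition` are assumptions chosen for illustration only. For this toy process the conditional mean $\mathbb{E}_t[X^{t+1}]$ is known in closed form, so the drift and deviation parts can be accumulated exactly as in the definitions of $A^t$ and $M^t$.

```python
import random


def simulate(x0, steps, drift, dev, rng):
    """Toy positive process: X^{t+1} = X^t * (1 + drift + dev * s), s = +-1 fair coin.

    Returns the path xs[0..steps] and the conditional means
    means[t] = E_t[X^{t+1}] = X^t * (1 + drift).
    """
    xs, means = [x0], []
    for _ in range(steps):
        x = xs[-1]
        means.append(x * (1 + drift))
        xs.append(x * (1 + drift + dev * rng.choice((-1, 1))))
    return xs, means


def doob_decomposition(xs, means):
    """Accumulate A^t = sum_{s<t} (E_s[X^{s+1}] - X^s) and
    M^t = sum_{s<t} (X^{s+1} - E_s[X^{s+1}])."""
    A, M = [0.0], [0.0]
    for s in range(len(xs) - 1):
        A.append(A[-1] + means[s] - xs[s])
        M.append(M[-1] + xs[s + 1] - means[s])
    return A, M


rng = random.Random(0)  # fixed seed, illustrative only
xs, means = simulate(x0=1.0, steps=200, drift=1e-4, dev=1e-3, rng=rng)
A, M = doob_decomposition(xs, means)

# Identity (31): X^t = X^0 + A^t + M^t, up to floating-point error.
identity_ok = all(abs(xs[t] - (xs[0] + A[t] + M[t])) < 1e-9 for t in range(len(xs)))

# One-step deviation bound in the spirit of (27), with xi = dev.
deviation_ok = all(
    abs(xs[t + 1] - means[t]) <= 1e-3 * max(xs[t], xs[0]) + 1e-12
    for t in range(len(means))
)
```

In this toy setting the drift terms play the role of $\alpha^t$ and the coin-flip terms the role of $\xi$; the sketch only confirms the bookkeeping behind (31), not the concentration statement itself.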



3.1.1 An estimate of $\xi$ for a certain type of sums

The inequalities we are dealing with have a common form and we will
repeatedly use the formula in (33) below in order to estimate the deviation
parameter $\xi$ used in Lemma 3.1.

The processes we consider are sums

\[ X^t = \sum_{i \in \mathcal{J}} X_i^t, \qquad X_i^t \ge 0, \]

for some index set $\mathcal{J}$. The terms have uniform bounds

\[ 0 \le X_i^t \le m, \]

for some asymptotic expression $m$. It will also be the case that each term
$X_i^t > 0$ in the sum changes moderately in the following sense: For $\xi_-, \xi_+ \ge 0$,
we have for all $t \in [0,m]$ and $i \in \mathcal{J}$ that

\[ (1 - \xi_-) X_i^t \le X_i^{t+1} \le (1 + \xi_+) X_i^t, \tag{32} \]

unless the term $X_i^t$ is killed, i.e. unless $X_i^{t+1} = 0$ but $X_i^t > 0$. We assume
that the maximum number of such terms killed is furthermore given by $\mathcal{N}$.
The following lemma is then immediate.

Lemma 3.2. With $X$, $m$, $\mathcal{N}$, $\xi_+$ and $\xi_-$ as above the conclusions of Lemma 3.1
hold with

\[ \xi = \frac{\mathcal{N} m}{X^0} + \xi_- + \xi_+. \tag{33} \]
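A minimal sketch of how the estimate of Lemma 3.2 is evaluated, assuming the deviation parameter has the form $\xi = \mathcal{N} m / X^0 + \xi_- + \xi_+$, which is the shape in which the lemma is invoked in Sections 3.2 to 3.4. The function name `deviation_parameter` and the concrete numbers (in particular $n = 1000$ and a term bound of exactly $1/n$) are illustrative assumptions, not values taken from the paper.

```python
from fractions import Fraction


def deviation_parameter(num_killed, term_bound, x0, xi_minus, xi_plus):
    """Assumed form of (33): xi = N * m / X^0 + xi_- + xi_+.

    num_killed -- N, the maximum number of terms killed in one step
    term_bound -- m, the uniform upper bound on a single term
    x0         -- X^0, the initial value of the sum
    xi_minus   -- maximal relative decrease of a surviving term
    xi_plus    -- maximal relative increase of a surviving term
    """
    return Fraction(num_killed) * term_bound / x0 + xi_minus + xi_plus


# Illustrative numbers only: n = 1000, term bound 1/n, X^0 = n^(-1/3) = 1/10,
# at most one term killed per step, relative changes of size 1/n.
n = 1000
p = Fraction(1, n)
xi = deviation_parameter(1, p, Fraction(1, 10), p, p)
```

With these stand-in values the dominant contribution is the killed-term part $\mathcal{N} m / X^0 = 1/100 = n^{-2/3}$, which is the order of magnitude that appears at the end of Section 3.4.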
3.2 The stability of $(B_{\ell(x)})$

Let $\ell(x)$ be any local line, i.e. $\ell(x) = \ell_{\mathcal{RK}}(x)$ or $\ell(x) = \ell_{\mathcal{RS}}(x)$. For $t \in [0,m]$,
let

\[ X^t := \sum_{y \in \ell(x)} p^t(y). \tag{34} \]

Our objective is to show that $(1 - \delta) \le X^t \le (1 + \delta)$ with very high
probability, where $\delta = \log n / \sqrt{n}$. By the martingale property (12), the
drift, $\alpha$, for $X^t$ is zero and we have that $X^0 = 1$. Unless $p^{t+1}(y) = 0$ we
have, by (11), that $p^t(y) \le p^{t+1}(y) \le (1 + p)\, p^t(y)$. Thus, in the notation
from Lemma 3.2, we have $\xi_- = 0$ and $\xi_+ = p$. Moreover, each term in (34)
is smaller than $p$ (by (A)), so we have $m = p$ and hence, by Lemma 3.2,
$\xi = \mathcal{N} p + 0 + p$.

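Thus, in order to show that $\xi = p$, it only remains to show that $\mathcal{N}$,
the maximum number of terms "killed", is of order $O(1)$. We claim that
$\mathcal{N} \le 2$. The number of terms killed is $\sum_{y \in \ell(x)} K^{t+1}(y)$ where $K^{t+1} = L^{t+1} \circ
\psi_{\mathcal{KS}} + L^{t+1} \circ \psi_{\mathcal{DS}}$. Moreover, the maps $y \mapsto z = \psi_{\mathcal{KS}}(y)$ and $y \mapsto z = \psi_{\mathcal{DS}}(y)$
map $y \in \ell(x)$ one-to-one into a corresponding local line, $z \in \ell(\psi_{\mathcal{KS}}(y))$ and
$z \in \ell(\psi_{\mathcal{DS}}(y))$ respectively. Since $L^{t+1}$ is latin this means that

\[ \mathcal{N} \le \sum_{y} L^{t+1} \circ \psi_{\mathcal{KS}}(y) + \sum_{y} L^{t+1} \circ \psi_{\mathcal{DS}}(y) \le 1 + 1 = 2. \tag{35} \]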

□ 

3.3 Proof of stability for $(C_{i,k,l})$

For a fixed $i \in \mathcal{R}$ and $k, l \in \mathcal{K}$, $k \ne l$, we consider the sum $X^t := \sum_{\gamma} X_\gamma^t$
where for $\gamma \in \mathcal{S}$

\[ X_\gamma^t := p^t(ik\gamma) \cdot p^t(il\gamma). \tag{36} \]

We have $X^0 = 1/n$ and $X_\gamma^t \le p^2$ by (A). Our objective is to prove that with
very high probability $X^t \le (1 + \delta) X^0$ where $\delta = \log n / \sqrt{n}$. By Lemma 3.1,
it is enough to show that $\xi = p$ and $\alpha = p$. These bounds are proved in (37)
and (39) below.

By (11), each term $X_\gamma^t$ will either increase by a factor of at most
$(1 + p)^2 = 1 + p$ or be killed. Furthermore, at most four terms can be killed, since,
by the computation already done in (35), at most two points in each of the
cells $(i,k)$ and $(i,l)$ are killed. Hence, with $\xi$ as in Lemma 3.2,

\[ \xi \le \frac{4p^2}{1/n} + 0 + p = p. \tag{37} \]
In order to calculate the drift $\alpha$ of $X^t$, fix $\gamma \in \mathcal{S}$ and let $K_k = K^{t+1}(i,k,\gamma)$
and $r_k = \mathbb{E}_t[K_k]$. Note that $r_k = q^t \circ \ell_{\mathcal{KS}}(i,k,\gamma) + q^t \circ \ell_{\mathcal{DS}}(i,k,\gamma) = p$. Then

\[ \frac{\mathbb{E}_t[X_\gamma^{t+1}]}{X_\gamma^t} = \frac{\mathbb{E}_t[(1 - K_k)(1 - K_l)]}{\mathbb{E}_t[(1 - K_k)]\, \mathbb{E}_t[(1 - K_l)]} = \frac{1 - r_k - r_l + \mathbb{E}_t[K_k K_l]}{1 - r_k - r_l + r_k r_l}. \tag{38} \]

But the event $K_k K_l \ne 0$ can happen only when the diagonal through the
cell $(i,k)$ intersects row $t+1$ in column $l$, or vice versa, and
this can be the case for at most two values of $t$. For all other values of $t$ the
diagonals and columns through cells $(i,k)$ and $(i,l)$ intersect row $t+1$ at four
disjoint positions. For these $t$, at most one cell in row $t+1$ is coloured by
any colour $\gamma$, so at most one of the indicators $K_k$ and $K_l$ equals one, making
$K_k K_l = 0$. Hence it holds, for all but at most two values of $t$ and uniformly
for all $\gamma$, that $\mathbb{E}_t[K_k K_l] = 0$, and from (38) it is clear that $\mathbb{E}_t[X^{t+1}] \le X^t$ for
these $t$.

Also, for the two possible exceptional values of $t$, when $\mathbb{E}_t[K_k K_l]$ is
positive, we clearly have that

\[ \mathbb{E}_t[K_k K_l] \le \min\{r_k, r_l\} \le p, \]

even assuming the worst possible correlation. From (38), we deduce that
$\mathbb{E}_t[X^{t+1}] \le X^t + \alpha^t \max\{X^0, X^t\}$, with $\alpha^t = p$. Putting this together gives
that $\alpha = \sum_{t=0}^{m-1} \alpha^t$ in Lemma 3.1 is bounded by

\[ \alpha \le (t - 2) \cdot 0 + 2p = p. \tag{39} \]

□ 
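The ratio in (38) can be evaluated directly to see the two regimes used in the drift estimate. This is a minimal sketch under assumptions: the helper name `drift_ratio` and the sample value $1/100$ for $r_k$ and $r_l$ are illustrative only. The numerator is $\mathbb{E}_t[(1-K_k)(1-K_l)] = 1 - r_k - r_l + \mathbb{E}_t[K_k K_l]$ and the denominator is $(1 - r_k)(1 - r_l)$.

```python
from fractions import Fraction


def drift_ratio(r_k, r_l, joint):
    """One-step ratio E_t[X_gamma^{t+1}] / X_gamma^t as in (38).

    r_k, r_l -- E_t[K_k] and E_t[K_l]
    joint    -- E_t[K_k * K_l]
    """
    numerator = 1 - r_k - r_l + joint
    denominator = 1 - r_k - r_l + r_k * r_l  # equals (1 - r_k) * (1 - r_l)
    return numerator / denominator


r = Fraction(1, 100)  # illustrative kill probability

# Generic step: the two indicators exclude each other, so the joint term is 0
# and the ratio is at most 1 (no upward drift).
generic = drift_ratio(r, r, Fraction(0))

# Exceptional step: worst possible correlation, joint = min(r_k, r_l).
exceptional = drift_ratio(r, r, min(r, r))
```

In the generic case the ratio is slightly below one, and in the exceptional case it exceeds one by a quantity of the same order as $r$, which is the behaviour summed up in (39).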



3.4 Proving that $(A_x)$ is stable

The derivation of the stability of $(A_x)$ is a bit more involved. The important
part is perhaps the identity (46) below, which makes it possible to relate the
growth of the product $\prod_{s=0}^{t-1} (1 - p^s \circ \ell^s)^{-1}$ to that of a sum $S^t = \sum_{s=0}^{t-1} p_\ell^s \circ \ell^s$
with suitable concentration properties. Our aim is to prove that with very
high probability

^<(l + o(l)) (l-(l + o(l))^ 

Since $t/n \le (1 - \varepsilon) = 1 - \Omega(1)$, the inequality (40) implies $(A_x)$ for large
enough $n$.
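Let $\ell$ denote a central parallel class, i.e. $\ell$ is either $\ell_{\mathcal{KS}}$ or $\ell_{\mathcal{DS}}$. Define
the vector of stopping times $\tau_\ell = \tau_\ell(x)$, $x \in \mathcal{X}$, giving the time that the
central line $\ell(x)$ is "killed", i.e. let

\[ \tau_\ell(x) := \inf\Bigl( \Bigl\{ t : \sum_{y \in \ell(x)} L^t(y) = 1 \Bigr\} \cup \{\infty\} \Bigr). \tag{41} \]

Also, let

\[ \hat{t}(x,t) := t \wedge \bigl(\tau_{\ell_{\mathcal{KS}}}(x) \wedge \tau_{\ell_{\mathcal{DS}}}(x) - 1\bigr) \wedge T \wedge \tau_c(x). \]

Note that the value of $\hat{t}(x,t)$ is determined at time $t$ for all $t$ and that $\hat{p}^t(x) =
p^{\hat{t}(x,t)}(x)$ is an adapted process which is increasing in $t$. Moreover, we have
$\hat{p}^t(x) = p^t(x)$ unless $t \ge \tau_{\ell_{\mathcal{KS}}}(x) \wedge \tau_{\ell_{\mathcal{DS}}}(x)$ and $\tau_{\ell_{\mathcal{KS}}}(x) \wedge \tau_{\ell_{\mathcal{DS}}}(x) \le T \wedge \tau_c(x)$,
in which case $p^t(x) = 0$.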

From the definition (8), we know that 



n n(i-p^or)-\ (42) 

Note that the first factor to the right is negligible: by (A) and the estimate
in (10), it is of order $(1 + bp + p^2)^t = (1 + b)$. The aim will thus be to show
that

\[ \prod_{s=0}^{t-1} (1 - p^s \circ \ell^s)^{-1} = \bigl(1 - (1 + o(1))\, t/n \bigr)^{-1} \tag{43} \]

where $\ell = \ell_{\mathcal{KS}}$ or $\ell = \ell_{\mathcal{DS}}$.

For $\ell = \ell_{\mathcal{KS}}$ or $\ell = \ell_{\mathcal{DS}}$, define recursively the adapted process $p_\ell^t \in [0,1]^{\mathcal{X}}$,
$t \in [0,m]$, as follows: Let $p_\ell^0(x) = p^0(x) = 1/n$ and set

p*+^(x) := ~ ^^^'^"^^^ ■ P*^'^"^^ ^ - ^^'^''^ ~ 1) AT Are(a:) ^^^^ 
lp^(a;) otherwise 

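where

\[ S^t(x) := \sum_{s=0}^{t-1} p_\ell^s \circ \ell^s(x). \]

The definition of $p_\ell^t$ in (44) implies, for all $y \in \ell(x)$, that either $p_\ell^{t+1}(y) = 0$,
$p_\ell^{t+1}(y) = p_\ell^t(y)$ or else

\[ p_\ell^{t+1}(y) = \frac{(1 - p^t \circ \ell^t(y))\, p_\ell^t(y)}{1 - q^t \circ \ell_{\mathcal{KS}}(y) - q^t \circ \ell_{\mathcal{DS}}(y)}. \]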
Therefore

\[ p_\ell^t(y) \cdot (1 - p) \le p_\ell^{t+1}(y) \le p_\ell^t(y) \cdot (1 + p), \tag{45} \]

except for the case when the term $p_\ell^t(y)$ is killed. Note also that killing of $p_\ell^t(y)$
occurs exactly when $\tau_{\ell'}(y) = t + 1 < \tau_\ell(y)$, where $\ell'(y)$ is the complementary
central line, i.e. $\ell' = \ell_{\mathcal{KS}}$ if $\ell = \ell_{\mathcal{DS}}$ and vice versa.

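If $\hat{t} = t$ then $p^t = p_\ell^t / (1 - S^t)$ and we deduce the identity

\[ \prod_{s=0}^{t-1} (1 - p^s \circ \ell^s)^{-1} = \prod_{s=0}^{t-1} \frac{1 - S^s}{1 - S^{s+1}} = \frac{1}{1 - S^t}. \tag{46} \]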

In order to prove the stability of (40) it is therefore enough to show that for
all $x$ with very high probability

\[ S^{t_0}(x) \le (1 + o(1)) (t_0/n), \tag{47} \]

where the time $t_0 \in [0,m]$ is fixed but arbitrary.
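The telescoping behind the identity (46) can be verified with exact arithmetic. The following is a minimal sketch under assumptions: the list `p_ell` stands in for the values $p_\ell^s \circ \ell^s(x)$, the function name `telescoping_check` is hypothetical, and the constant choice $1/n$ with $n = 10$ is for illustration only. It uses $S^{s+1} = S^s + p_\ell^s \circ \ell^s$ and $p^s = p_\ell^s / (1 - S^s)$, so that each factor $1 - p^s$ equals $(1 - S^{s+1}) / (1 - S^s)$.

```python
from fractions import Fraction


def telescoping_check(p_ell):
    """Compare prod_{s<t} (1 - p^s)^{-1} with 1 / (1 - S^t).

    p_ell[s] stands in for p_ell^s o ell^s(x); the partial sums S^s must stay
    below 1 for the quantities to be defined.
    """
    partial = [Fraction(0)]  # partial[s] = S^s, with S^0 = 0
    for value in p_ell:
        partial.append(partial[-1] + value)

    product = Fraction(1)
    for s, value in enumerate(p_ell):
        p_s = value / (1 - partial[s])  # p^s = p_ell^s / (1 - S^s)
        product *= 1 / (1 - p_s)

    closed_form = 1 / (1 - partial[-1])
    return product, closed_form


# Illustrative input: n = 10 and six steps with p_ell^s = 1/n, so S^6 = 6/10.
product, closed_form = telescoping_check([Fraction(1, 10)] * 6)
```

Here both sides equal $5/2 = (1 - 6/10)^{-1}$, which is why a bound of the form (47) on the sum $S^{t_0}$ translates into a bound on the product.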

So fix $t_0 \in [0,m]$ and let $x \in \mathcal{X}$ be arbitrary. In order to ease the notation,
we from now on suppress the dependency on $x \in \mathcal{X}$ for most quantities. Let
$\ell^{\le t_0}(x) := \{ y \in \ell(x) : \tau_c(y) \le t_0 \}$. Define for $t \in [0, t_0]$ the variable $X^t \in \mathbb{R}_+$
by

\[ X^t := \sum_{y \in \ell^{\le t_0}} p_\ell^t(y). \]

Since $p_\ell^t(y) = p_\ell^{t \wedge \tau_c(y)}(y)$, we have that $X^{t_0} = S^{t_0}$. Since $|\ell^{\le t_0}| = t_0$ and $p_\ell^0 = 1/n$ we
have $X^0 = t_0/n$. Moreover, on account of (45), we have $X^t \le (1 + tp) X^0$ and
thus the sought bound, (47), clearly follows if $t_0 = o(n)$. We may therefore
assume that $t_0$ is greater than, say, $n^{2/3}$ and hence that $X^0 \ge n^{-1/3}$.

Thus, we can conclude the proof by showing that the total drift, $\alpha$, of $X^t$
satisfies $\alpha \le b$ and that the deviation parameter, $\xi$, satisfies $\xi = O(n^{-2/3})$.
Then, from Lemma 3.1, we can deduce that

\[ S^{t_0} = X^{t_0} \le \bigl(1 + O((n^{-1/6} + b) \log n)\bigr) (t_0/n), \]

with very high probability.

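From the definition (44), we know that $\mathbb{E}_t[p_\ell^{t+1} / p_\ell^t]$ equals 1 in the case
when $t \ge \tau_\ell \wedge T \wedge \tau_c$ and that otherwise it equals

\[ \mathbb{P}_t\{\tau_\ell = t + 1\} \cdot 1 + \mathbb{P}_t\{\tau_\ell > t + 1\} \cdot \frac{1 - S^{t+1}}{1 - S^t} \cdot \mathbb{E}_t\bigl[ p^{t+1}/p^t \bigm| \tau_\ell > t + 1 \bigr] \]

\[ = q^t \circ \ell + (1 - q^t \circ \ell) \cdot \frac{1 - p^t \circ \ell^t}{1 - q^t \circ \ell} = 1 + q^t \circ \ell - p^t \circ \ell^t, \]

where we have used that $\frac{1 - S^{t+1}}{1 - S^t} = 1 - p^t \circ \ell^t$ and that $\mathbb{E}_t[p^{t+1}/p^t \mid \tau_\ell > t + 1]$
equals

\[ \frac{1 - q^t \circ \ell - q^t \circ \ell'}{1 - q^t \circ \ell} \cdot \frac{1}{1 - q^t \circ \ell - q^t \circ \ell'} = \frac{1}{1 - q^t \circ \ell}, \]

where $\ell'$ is the complementary line.
Hence, for all $t$,

\[ \mathbb{E}_t[p_\ell^{t+1}] = \begin{cases} p_\ell^t \cdot (1 + q^t \circ \ell - p^t \circ \ell^t) & t < \tau_c \wedge T \wedge \tau_\ell \\ p_\ell^t & \text{otherwise} \end{cases} \;\le\; p_\ell^t (1 + pb). \tag{48} \]

From (48) we can estimate the drift term

\[ \alpha \le t_0 b p \le b. \]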

Note that only one cell in the row $t + 1$ is coloured with the colour $\gamma$
common to all points in $\ell(x)$ and that this cell can lie on at most one crossing
central line $\ell'(y)$ intersecting the given central line $\ell(x)$. This means that only
one term in the sum $\sum_{\ell^{\le t_0}} p_\ell$ above can be killed at a time. Putting this and
the estimate (45) into the formula (33) gives

\[ \xi \le 1 \cdot p / X^0 + p + p = O(n^{-2/3}). \]

□ 